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Abstract 

We  consider  the  fitting  of  geometric  elements,  such  as  lines,  planes,  circles,  cones,  and 
cylinders,  in  such  a way  that  the  sum  of  distances  or  the  maximal  distance  from  the 
element  to  the  data  points  is  minimized.  We  refer  to  this  kind  of  distance  based  fitting  as 
orthogonal  distance  regression  or  ODR.  We  present  a separation  of  variables  algorithm 
for  h and  loo  ODR  fitting  of  geometric  elements.  The  algorithm  is  iterative  and  allows  the 
element  to  be  given  in  either  implicit  form  f(x,0)  = 0 or  in  parametric  form  x = g(t,  /3), 
where  ft  is  the  vector  of  shape  parameters,  x is  a 2-  or  3-vector,  and  s is  a vector  of 
location  parameters.  The  algorithm  may  even  be  applied  in  cases,  such  as  with  ellipses, 
in  which  a closed  form  expression  for  the  distance  is  either  not  available  or  is  difficult  to 
compute.  For  li  and  loo  fitting,  the  norm  of  the  gradient  is  not  available  as  a stopping 
criterion,  as  it  is  not  continuous.  We  present  a stopping  criterion  that  handles  both  the 
li  and  the  loo  case,  and  is  based  on  a suitable  characterization  of  the  stationary  points. 

1 Introduction 

Let  us  be  given  N points  {zi}F=1  G and  a geometric  object  S in 

• implicit  form  {x  : = 0}  with  a scalar  function  /,  or 

• parametric  form  x = g(t,  6)  with  a vector  function  g, 

where  the  shape  parameter  vector  f3  G C lies  within  a closed,  convex  subset  C of  Rm. 
Denote  by 

MP)  = inf{||^  - Xi ||2  : Xi  on  5} 
the  distance  of  the  point  z*  to  the  geometric  object  S.  Let 

be  the  distance  vector  with  norm 

m = \\m\\, 

where  ||0(/3)||  denotes  either  the  Zoo-norm 

$(/3)  = max(0!  <Pn{0)) 
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or  the  Zi-norm 

N 

$(/3)  = £&(/?)■ 

t=l 

We  consider  the  problem: 

Find  f3  E C and  points  {xi}^  on  S such  that  $(/?)  = ||</>(/3)||  is  minimal. 

If  the  minimum  is  attained,  each  function  (j)t(0)  = \\zi  — Xi\\2  is  minimal  for  the 
point  x i € S.  Then  Zi  — Xi  is  orthogonal  to  S for  interior  points  of  S,  hence  the  term 
“orthogonal  distance  regression”  or  “ODR”. 

Nonlinear  l\  ODR  problems  are  treated  in  Watson  [10,  12].  A survey  for  linear 
problems  is  given  in  Zwick  [13]. 

As  stated,  the  problem  has  dimension  Nd  + m.  In  typical  metrology  applications,  the 
data  set  is  very  large  so  that  a direct  approach  to  the  problem  becomes  computationally 
expensive.  We  use  a separation  of  variables  algorithm  that  was  used  in  [2,  4]  and  Turner 
[9]  for  the  I2  ODR  problem.  Each  iteration  of  our  algorithm  consists  of  two  steps.  In  the 
first  step,  the  foot  points  {2^}^  on  S,  i.e.,  the  location  parameters,  are  calculated  for 
a fixed  parameter  vector  0.  These  d-dimensional  subproblems  can  be  efficiently  handled 
by  trust  region  methods  [3] . 

In  the  second  step,  a first  order  approximation  of  f>i(0)  is  employed,  that  can  be  given 
without  explicit  knowledge  of  the  dependence  of  the  optimal  points  Xi{0)  on  0.  At  this 
stage,  the  norm  of  the  correction  to  the  parameter  vector  0 is  limited  by  a trust  region 
strategy.  The  correction  can  be  computed  by  solving  a linear  programming  problem. 
For  general  nonlinear  minimax  problems  such  methods  were  proposed  in  Madsen  and 
SchjjER— Jacobsen  [6],  Hald  and  Madsen  [1]  and  Jonasson  and  K.  Madsen  [5]. 

Our  convergence  analysis  follows  the  general  approach  given  in  Powell  [8]  and  More 
[7].  But  in  order  to  handle  the  li  and  case  we  cannot  use  the  norm  of  the  gradient 
as  a stopping  or  convergence  criterion,  since  the  gradient  is  not  continuous.  Moreover,  a 
neccessary  condition  for  a minimum  is  that  the  subgradient  contains  the  zero  functional, 
see,  e.g.,  Watson  [11].  In  order  to  overcome  this  difficulty,  we  introduce  a replacement 
for  the  norm  of  the  gradient  that  serves  both  as  a stopping  criterion  and  as  an  essential 
tool  in  the  convergence  proof. 

2 The  trust  region  algorithm 

At  each  iteration  of  our  algorithm  we  solve  the  low-dimensional  subproblems  (Pi)  for 
0 = 0 k for  each  fixed  i,  i = 1, . . . , N: 

Minimize  || z*  — ®j||2  subject  to  f(xi,0)  =0  or  Xi  — g(ti,0). 

In  order  to  apply  the  trust  region  method  to  l\  and  ODR  we  need  a first  order 
approximation  ipi(0,a ) to  4>i{0).  With  appropriate  regularity  assumptions,  this  can  be 
computed  without  knowledge  of  the  dependence  of  the  optimal  points  Xi(0)  on  0 ([2], 
[4]).  This  means  that  the  iterative  improvement  in  0 is  uncoupled  from  the  calculations 
of  Xi(0),  whereby  a true  first  order  approximation  of  the  objective  function  is  attained. 
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In  the  case  of  the  implicit  form  f(x,0 ) = 0,  the  first  order  approximation  <pi  (/3  + a ) = 
ipi  (0,  a)  + o(a)  is  given  by 


0)T{zi  ~ *i)  + ^pf(xi,0)Ta 

l|Vx/(*i,/3)||2 


(2.1) 


as  a first  order  approximation  to  the  signed  distance  ±c),  (.6  + a).  For  the  parametric 
form  x = g(t , (3),  we  have 


M0ia)  = \\zi  ~ xih  - j1 — ^\-D0g{Xi,(3)a.  * (2.2) 

Note  that  (2.1)  makes  sense  even  for  points  on  the  surface.  For  an  orientable  hypersurface 
in  parametric  form,  the  expression  m (2-2)  should  be  replaced  by  the  unit  normal 

for  points  on  the  surface. 

Denote  by 

i>{(3)  = (Vh03),---,tM/3))T 

the  vector  of  the  linearized  distances  and  let 


*((3,a)  = \\m<x)\\-  urn  • 

The  main  algorithm: 

• Step  0:  An  initial  0O  £ Mm , a trust  region  radius  A0  > 0,  and  constants  0 < /i  < 1 
and  0<7<1<7,  A are  given.  Set  k — 0. 

• Step  1:  Minimize  \k(/3fc,a)  subject  to  ||o||2  < An-  and  0k  + a e C.  Let  ak  denote 
the  solution  with  minimal  norm. 

• Step  2:  If  Qfe  = 0,  stop. 

• Step  3:  Compute 

*(&  + «*)-*(&) 

Pk  *(0k,ak) 

• Step  4: 

(1)  Successful  step.  If  Pk  > M set 

/?k+l  = /3fc  + 

and  choose  Afc+i  such  that 

A k < Ajt+i  < min(7Afc,A).  (2.3) 

(2)  Unsuccessful  step.  Otherwise,  set 

0k+ 1 = 0k  and  0 < Afc+i  < 7Afc. 

• Step  5:  Increment  k by  one  and  go  to  Step  1. 

3 Global  convergence 

In  an  abstract  setting  our  problem  may  be  formulated  as 
Minimize  <&(/?)  = ||  </>(/?)  ||  on  a closed,  convex  set  C. 


Fitting  of  geometric  elements 


165 


To  solve  this  problem,  at  each  stage  of  the  iteration  we  solve  the  following  constrained, 
linearized  problem: 

Minimize  W(/3,  a)  subject  to  0 + a € C and  ||a||  < A. 

In  order  to  get  the  linearization  in  our  case,  we  solve  the  least  distance  subproblems 
(Pi),  i = 1, . . . , N,  with  a shape  parameter  (3,  and  use  (2.2),  or  (2.1). 

For  the  purpose  of  characterizing  stationary  points,  we  introduce  the  quantity 

Vi(/3)  = -inf{¥(/3,6)  | INI  <1,  0+ a € C}. 

Note  that  V\(0)  > 0,  since  '!'(/?,  0)  = 0. 

By  convexity,  Vi(/3)  = 0 implies  that  a = 0 is  a solution  of  the  linearized  minimization 
problem.  Madsen  and  Schjter- Jacobsen  [6]  have  shown  that  the  latter  condition  is 
equivalent  to  a condition  given  therein  for  the  functional  to  have  a stationary  point.  In 
order  to  prove  Theorem  3.3  we  prove  a lemma  that  was  given  in  a similar  form  for  the 
loo  case  in  Madsen  and  SchJvER-Jacobsen  [6]  and  Jonasson  and  Madsen  [5]).  We 
give  a different  proof  that  is  applicable  to  both  the  l\  and  lex,  cases. 

Lemma  3.1  Let  Vi(/3)  > e and  A < A.  For  the  solution  of  the  linearized  problem  the 
estimate 

tf(/3,a)  < -CeA  (3.1) 

holds,  with  a constant  that  depends  only  on  e and  A. 

Proof:  According  to  the  definition  of  Vi(/9)  and  the  continuity  of  there  exists  a 
feasible  au  with  ||ax||  < 1 such  that 

V{0,oti  ) = — c. 

Let  a = ta\,  where  t = min(l,  A).  Since  H 1(0,  a)  is  a convex  function,  we  get 
$( 0 , a)  < (1  - t)V(0, 0)  + tV{0,  Qi)  = -te. 

Since 

t > Amin(l,  1/A) 

we  get  the  conclusion  with  C = min(l,  1/A).  □ 

Proposition  3.2  For  a minimum  point, 

Vi(/3)  = 0 

holds. 

Proof:  Assume  the  contrary,  then  Vi(/?)  = e > 0 holds.  According  to  the  definition  of 
^(0,  a)  we  have 

$(/?  + a)  = $(/?)  + *(J3,  a)  + o(a). 

By  Lemma  3.1,  we  can  find  an  a with  ||a||  < A such  that  (3.1)  holds.  As  in  the  proof  of 
the  Lemma,  we  may  conclude  that 

<f>(/3  + ta ) < $(/3)  — CetA  + o(ta) 

for  0 < t < 1.  If  we  let  t — > 0 we  get  a contradiction  to  the  minimum  property.  □ 
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Theorem  3.3  Either  the  algorithm  ends  in  a finite  number  of  steps,  or  a sequence  ft 
is  generated  for  which  liminffc_0o  Vi  (ft)  = 0. 

Proof:  Assume  the  contrary.  Then  there  exists  e > 0 such  that  Vj(ft)  > e holds  for 
all  k.  By  the  definition  of  pk  and  the  lemma,  it  follows  that  for  a successful  step 

' <Kft+i)  < 4>(Pk)  - pCeAk 

and  by  the  updating  rule  for  Ajt+i  we  get 

Aa-+i  < c(^>(ft+i)  - 0(ft-)), 

with  c = 1 /(pCe).  Combining  this  inequality  with  the  updating  rule  for  an  unsuccessful 
step  yields 

Afc+i  < 7^/t + c(«jf>(ft+i)  -<t>(0k)). 

By  summation  and  the  monotonicity  of  fiifik)  it  follows  that  for  all  N 


Since  this  implies  the  convergence  of  )T  A*.,  we  get  lim  A*  = 0.  From  ||ft.||  < Ak  we 
obtain  the  convergence  of  ft.  From  the  definition  of  pk  it  then  follows  that  limp/,,  = 1. 
But  then  the  updating  rule  (2.3)  implies  that  eventually  A*+i  > A*,  which  gives  a 
contradiction.  □ 

Theorem  3.4  (Global  Convergence,  cf.  More  [7],  Powell  [8])  Assume  that  Vi  (ft 
is  uniformly  continuous.  Then  either  the  algorithm  ends  in  a finite  number  of  steps,  or 
a sequence  ft  is  generated  for  which 

lim  Vi(ft)  = 0. 

k—*oc 

Proof:  Assume  the  contrary.  Then  there  exists  an  ei  such  that  for  each  kg  there  exists 
a k > ko  with 

Vl(ft)  > €i. 

By  Theorem  3.3  we  can  find  an  index  l > k such  that 

V1(ft)<e1/2 

(ko  will  be  determined  later).  We  choose  the  smallest  such  /.  As  in  the  proof  of  Theorem 
3.3,  it  follows  that  for  that  a successful  step  with  k <i  < l, 

llft+r  - All  < A a,  < 2c\  (4>(fii)  - m+ 1)). 

Clearly,  this  also  holds  for  an  unsuccessful  step.  This  yields 

IIA  - ftll  < 2Cl(<j&(ft)  - 0(A))- 

Since  </>(ft)  converges  by  monotonicity,  we  can  make  ||ft  - ft||  arbitrarily  small  for  large 
enough  ko  ■ By  the  uniform  continuity  of  V i (0)  we  infer 

|V1(ft)-V1(ft)|<e1/2, 

which  is  a contradiction.  □ 
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4 A numerical  example 

As  an  illustrative  example,  we  fit  an  ellipse  to  data,  given  as  coordinate  pairs  in  K2.  There 
are  24  data  points  and  five  components  to  the  shape  parameter  vector  (i.e.,  n = 2,d  = 
2,m  = 5,  N = 24).  We  used  a standard  parameterization  involving  a center  (xo,yo),  the 
axes  (a,  b),  and  a rotation  angle  6.  ■ 

The  output  is  shown  below.  The  initial  values  for  the  parameters  and  the  obtained 
parameters  in  three  different  norms  are  given  in  Table  1.  In  the  l2  case,  we  give  as  the 
error  the  root  mean  square  error,  in  the  l\  case  the  mean  absolute  deviation,  and  in  the 


Fig.  1.  l2,  h,  and  loo- Approximation. 


x0 

Xi 

a 

b 

0 (degrees) 

Error 

Initial 

values 

0.4989881 

-1.4262126 

4.6719913 

0.4364267 

20.75913 

h 

0.6637511 

-1.3987826 

5.5124671 

0.3376480 

20.90124 

0.11520 

h 

0.5368646 

-1.4465520 

5.2778061 

0.3358224 

20.88869 

0.09047 

^OO 

0.7694412 

-1.3829474 

4.9731226 

0.4491259 

20.66893 

0.23489 

Tab.  1. 

Parameters  for  different 

norms. 
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The  number  of  iterations  in  each  case  was  five  or  six.  We  note  that  the  deviations  for 
the  best  fit  li  and  ellipses  exhibit  behavior  typical  to  these  norms:  five  of  the  data 
points  lie  on  the  best  fit  l\  ellipse  and  there  are  six  deviations  of  largest  magnitude  in 
the  loo  case. 
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